



## **PROBABILITY-DRIVEN MULTIBIT FLIP-FLOP INTEGRATION WITH CLOCK GATING**

***<sup>1</sup>BONIGALA SANTHI, <sup>2</sup>B LAKSHMI***

*<sup>1</sup>PG Scholar, Dept. of ECE, NOVA COLLEGE OF ENGINEERING AND TECHNOLOGY, Vijayawada, A.P*

*<sup>2</sup>Associate Professor, Dept. of ECE, NOVA COLLEGE OF ENGINEERING AND TECHNOLOGY, Vijayawada, A.P*

**ABSTRACT:** One of the major dynamic power consumers in computing and consumer electronics products is the system's clock signal, which is typically responsible for 30%-70% of the total dynamic power consumption. Clock gating is a predominant technique for power saving. However, the commonly used synthesis-based gating still leaves a large number of redundant clock pulses, and data-driven gating aims to disable these. Data-driven clock gating is employed for flip-flops (FFs) at the gate level: the clock signal driving an FF is disabled when the FF's state is not subject to change in the next clock cycle. Data-driven gating introduces area and power overheads that must be considered. To reduce this overhead, several FFs are grouped and driven by the same clock signal, which is generated by ORing the enabling signals of the individual FFs. Pseudorandom bit generators (PRBGs) are widely used in electronic equipment, and many researchers have proposed novel solutions to improve the inviolability required in cryptographic applications. The linear feedback shift register (LFSR) is the most widely used topology for implementing a PRBG. This paper presents a method to reduce the power consumption of the popular LFSR. The proposed scheme is based on the gated clock design approach and can offer a significant power reduction.

**KEYWORDS:** Linear Feedback Shift Register, Flip-Flop, Pseudorandom Bit Generators, Data-Driven Clock Gating.

**INTRODUCTION:** The increasing demand for low-power mobile computing and consumer electronics products has refocused VLSI design in the last two decades on lowering power and increasing energy efficiency. Power reduction is treated at all design levels of VLSI chips, from the architecture through the block and logic levels, down to the gate-level circuit and physical implementation. One of the major dynamic power consumers is the system clock signal, typically responsible for up to 50% of the total dynamic power consumption. Clock network design is a delicate procedure and is therefore done in a very conservative manner under worst-case assumptions. It incorporates many diverse aspects such as selection of sequential

**Copyright @ 2020 ijearst. All rights reserved.**

**INTERNATIONAL JOURNAL OF ENGINEERING IN ADVANCED RESEARCH  
SCIENCE AND TECHNOLOGY**

**Volume.02, IssueNo.11, December -2020, Pages: 272-282**

elements, controlling the clock skew, and deciding on the topology and physical implementation of the clock distribution network. Several techniques to reduce the dynamic power have been developed, of which clock gating is predominant. Ordinarily, when a logic unit is clocked, its underlying sequential elements receive the clock signal regardless of whether or not they will toggle in the next cycle. Clock enabling signals are usually introduced by designers during the system and clock design phases, where the inter-dependencies of the various functions are well understood. In contrast, it is very difficult to define such signals at the gate level, especially in control logic, since the inter-dependencies among the states of various flip-flops depend on automatically synthesized logic. There is a big gap between block disabling that is driven from the HDL definitions and what can be achieved with knowledge of the flip-flops' activities and how they are correlated with each other. This research presents an approach to maximize clock disabling at the gate level, where the clock signal driving a flip-flop is disabled (gated) when the flip-flop state is not subject to change in the next clock cycle.

With the invention of the integrated circuit, an influential trend started: accommodating as many components as possible in the same silicon area. In a span of four decades the world has moved from small-scale integration to very-large-scale integration, made possible by a streamlined semiconductor process technology that has improved continuously over the last forty years [6].

**LITERATURE SURVEY:** Several dynamic power management techniques are adopted in VLSI circuits, of which the major one is clock gating. The work in [5] uses multiple supply voltages to reduce clock tree power. The incoming high-voltage clock signal is down-scaled by means of a low-voltage buffer stage. The low-$V_{dd}$ signal is then propagated throughout the circuit, and regenerating elements (e.g., buffers) are inserted into the tree structure to ensure the appropriate speed and slew rate of the transitions. Finally, the original high voltage is restored through level shifters before the clock signals feed the flip-flops.

In [3], clock distribution using multiple voltages reduces the cost of the buffering and voltage converters that are essential in the multiple-supply-voltage power reduction technique. The approach presented by Pangjun and Sapatnekar addresses this limitation by providing a more sophisticated algorithm for introducing buffers into the clock tree and for placing the low-to-high voltage shifters, which are now not necessarily located right in front of the flip-flops. The algorithm considers the possibility of buffer insertion after every step of bottom-up subtree merging. In the interest of keeping the skew very close to zero, the algorithm guarantees that the number of regenerating elements is equalized along all root-to-sink paths of the tree. However, in spite of the solid theoretical basis of this solution, experimental results showed very small differences from the clock trees generated by the approach using multiple supply voltages.

The work in [2] focuses on interconnect power, i.e., energy dissipation due to the switching of interconnection capacitances, which are part of the total switched capacitance $C_j$ of each net. Applying wire capacitance reduction techniques to a small percentage of the wires can save the majority of the interconnect power. Capacitance can be lowered by interconnect reduction and by increasing interconnect spacing, which in turn reduces power dissipation. In [4], a review of some existing clock gating techniques is presented, and a new technique that provides more immunity to the problems of the available techniques is discussed. In 2005, A. Raychowdhury et al. [1] introduced multi-valued logic design using carbon nanotube field-effect transistors and compared designs in terms of power and performance; the power-delay product of the complementary adder was seen to be 22% higher than that of the ternary adder. R. S. Shelar examined clocks as a major source of power consumption in digital circuits.

### EXISTING TECHNIQUE:



Fig1. DDCG integrated into a $k$-MBFF.

- 1) a design methodology that fuses MBFF and DDCG, yielding considerable power savings;
- 2) a probability-driven algorithm that minimizes the expected DDCG MBFF power consumption.
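The gating rule behind Fig. 1 can be illustrated with a small behavioral sketch. It assumes, as stated in the abstract, that each FF's enable is raised when its input differs from its stored state and that the group's clock enable is the OR of the individual enables. The function names `ddcg_mbff_step` and `count_clock_pulses` are hypothetical and serve only this illustration; the sketch is not a gate-level model of the actual circuit.

```python
from typing import List, Tuple


def ddcg_mbff_step(q: List[int], d: List[int]) -> Tuple[List[int], bool]:
    """One clock cycle of a k-bit MBFF with data-driven clock gating.

    q holds the stored bits and d the data inputs for the next cycle.
    Returns the new state and whether the shared clock pulse was enabled.
    """
    # Per-bit enable: a FF needs a clock pulse only if its input differs from its state.
    enables = [qi ^ di for qi, di in zip(q, d)]
    # The whole group shares one gated clock: OR of the individual enables.
    clk_en = any(enables)
    # Without a clock pulse the state is held; with one, every bit samples its input.
    new_q = list(d) if clk_en else list(q)
    return new_q, clk_en


def count_clock_pulses(initial: List[int], data_stream: List[List[int]]) -> int:
    """Count how many cycles actually deliver a clock pulse to the group."""
    q = list(initial)
    pulses = 0
    for d in data_stream:
        q, clk_en = ddcg_mbff_step(q, d)
        pulses += int(clk_en)
    return pulses


# Illustrative input for a 2-MBFF: the data changes in only 2 of the 5 cycles.
stream = [[0, 0], [0, 1], [0, 1], [1, 1], [1, 1]]
print(count_clock_pulses([0, 0], stream), "of", len(stream), "cycles clocked")
```

In this made-up stream an ungated group would receive five clock pulses, whereas the gated group receives two. Grouping trades gating hardware against missed savings: one toggling bit is enough to clock the entire group.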

### INTEGRATING CLOCK GATING INTO MBFF

Let $p$ be the data-to-clock toggling probability. The expected energy $E_1$ consumed by a 1-bit FF is

$$E_1(p) = \lambda_1 + \mu_1 p \qquad (1)$$

where $\lambda_1$ is the energy of the FF's internal clock driver and $\mu_1$ is the energy of data toggling. In the general case of a $k$-MBFF, let $\lambda_k$ be the energy of the MBFF's internal clock driver and $\mu_k$ its per-bit data toggling energy. Assume that the FFs toggle with probability $p$ independently of each other. It has been shown in [14] that the expected energy is

$$E_k(p) = \sum_{j=0}^{k} (\lambda_k + j\mu_k) \binom{k}{j} p^j (1-p)^{k-j} = \lambda_k + k\mu_k p. \qquad (2)$$

It is important to note that toggling independence is a pessimistic assumption. In reality, the correlation between FF toggling yields higher energy savings than the model in (2). The ratio $(kE_1(p) - E_k(p))/kE_1(p)$ expresses the energy saving potential of a $k$-MBFF. The coefficients $\lambda$ and $\mu$ of the 65-nm 1-MBFF, 2-MBFF, and 4-MBFF were derived with SPICE simulations. Zero activity ($p = 0$) yields 35% savings for the 2-MBFF and 55% savings for the 4-MBFF, whereas full activity ($p = 1.0$) yields 15% savings for the 2-MBFF and 23% savings for the 4-MBFF. In typical VLSI systems, the average $p$ does not exceed 0.1, so high savings are achievable.

The group size $k$ that maximizes the energy savings solves the equation

$$(1 - p)^k \ln(1 - p)\, C_{FF} + \frac{C_{latch}}{k^2} = 0 \qquad (3)$$

where $C_{FF}$ and $C_{latch}$ are the clock input loads of an FF and a latch, respectively [2]. The solution to (3) for various activities is shown in Table I for typical $C_{FF}$ and $C_{latch}$. The above optimization does not take into account the clock driver sharing, which also affects the optimal grouping.

To grasp the power savings of a $k$-MBFF achievable by DDCG, the circuit of Fig. 1 was simulated with SPICE for various activities $p$ and $k = 2, 4, 8$. In the resulting power plot for a 2-MBFF, line (a) is the power consumed by two 1-bit FFs driven independently of each other. The $3.8\text{-}\mu\text{W}$ power at zero activity is due to the toggling of the clock driver at each FF, and it is always consumed regardless of the activity.
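Equations (1)-(3) can be checked numerically with a short sketch. It reads $\mu_k$ as a per-bit energy, so the closed form of (2) is taken to be $\lambda_k + k\mu_k p$, and it reads (3) as $(1-p)^k \ln(1-p)\,C_{FF} + C_{latch}/k^2 = 0$. All numeric coefficients below are hypothetical, normalized values: the 2-MBFF pair is chosen only so that the saving ratio reproduces the 35% and 15% endpoints quoted above, and none of them are the SPICE-derived coefficients. The function names are likewise illustrative.

```python
from math import comb, log
from typing import Optional


def expected_energy_sum(k: int, lam_k: float, mu_k: float, p: float) -> float:
    """Equation (2) as a binomial expectation: j of the k bits toggle."""
    return sum((lam_k + j * mu_k) * comb(k, j) * p**j * (1 - p)**(k - j)
               for j in range(k + 1))


def expected_energy_closed(k: int, lam_k: float, mu_k: float, p: float) -> float:
    """Closed form of equation (2); k = 1 gives equation (1)."""
    return lam_k + k * mu_k * p


def saving_ratio(k: int, lam_1: float, mu_1: float,
                 lam_k: float, mu_k: float, p: float) -> float:
    """Energy saving potential (k*E1 - Ek) / (k*E1) of a k-MBFF."""
    e1 = expected_energy_closed(1, lam_1, mu_1, p)
    ek = expected_energy_closed(k, lam_k, mu_k, p)
    return (k * e1 - ek) / (k * e1)


def eq3_lhs(k: float, p: float, c_ff: float, c_latch: float) -> float:
    """Left-hand side of equation (3), valid for 0 < p < 1."""
    return (1 - p)**k * log(1 - p) * c_ff + c_latch / k**2


def solve_eq3(p: float, c_ff: float, c_latch: float,
              k_max: int = 64) -> Optional[float]:
    """Smallest real k >= 1 at which (3) changes sign from + to -, else None."""
    for k_lo in range(1, k_max):
        if eq3_lhs(k_lo, p, c_ff, c_latch) > 0 >= eq3_lhs(k_lo + 1, p, c_ff, c_latch):
            lo, hi = float(k_lo), float(k_lo + 1)
            for _ in range(60):  # plain bisection inside the bracketing interval
                mid = 0.5 * (lo + hi)
                if eq3_lhs(mid, p, c_ff, c_latch) > 0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
    return None


# Hypothetical normalized coefficients (NOT the SPICE values from the text).
LAM_1, MU_1 = 1.0, 1.0
LAM_2, MU_2 = 1.3, 1.05

for p in (0.0, 0.1, 1.0):
    print(f"p={p}: 2-MBFF saving = {saving_ratio(2, LAM_1, MU_1, LAM_2, MU_2, p):.3f}")

# Equation (3) with equal, hypothetical clock loads C_FF = C_latch = 1.
print("k solving (3) at p=0.1:", solve_eq3(0.1, 1.0, 1.0))
```

The solver returns a real-valued $k$, while practical group sizes are restricted to the MBFF sizes available in the cell library, so the result is only a guide to which size to prefer.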

## MEMORY ORGANIZATION:

**DELAY BUFFERS:** This section describes PJMEDIA's implementation of the delay buffer. A delay buffer works much like a fixed jitter buffer: it delays frame retrieval by some interval so that the caller gets a continuous stream of frames from the buffer. This is useful when the operations are not evenly interleaved, for example when the caller performs a burst of `put()` operations followed by a burst of `get()` operations. The delay buffer stores the burst of frames so that `get()` operations always obtain a frame from the buffer (assuming that the numbers of `get()` and `put()` calls match). The buffer is adaptive, that is, it continuously learns at run time the optimal delay to apply to the audio flow. Once the optimal delay has been learned, the delay buffer applies it to the audio flow, expanding or shrinking the audio samples as necessary when the number of samples in the buffer is too low or too high.
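The bursty `put()`/`get()` pattern described above can be illustrated with a minimal sketch. The `DelayBuffer` class below is hypothetical and is not PJMEDIA's API: it is a plain bounded FIFO that assumes a fixed capacity and leaves out the adaptive delay learning and the expanding or shrinking of samples.

```python
from collections import deque
from typing import Deque, Optional


class DelayBuffer:
    """Bounded FIFO of frames (illustrative only, no adaptive behavior)."""

    def __init__(self, max_frames: int) -> None:
        self.max_frames = max_frames
        self._frames: Deque[bytes] = deque()

    def put(self, frame: bytes) -> None:
        """Store a frame; on overflow the oldest frame is dropped."""
        if len(self._frames) == self.max_frames:
            self._frames.popleft()
        self._frames.append(frame)

    def get(self) -> Optional[bytes]:
        """Return the oldest stored frame, or None if the buffer is empty."""
        return self._frames.popleft() if self._frames else None


# A burst of three put() calls followed by a burst of three get() calls.
buf = DelayBuffer(max_frames=8)
for frame in (b"f0", b"f1", b"f2"):
    buf.put(frame)
received = [buf.get() for _ in range(3)]
print(received)
```

Because the burst of frames is held in the buffer, each `get()` in the following burst finds a frame waiting, which is the behavior the text describes for matched numbers of `put()` and `get()` calls.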



Fig2 : Buffer

## GENERAL TECHNIQUE:



Fig3 : Existing Block Of Memory Organization



Fig 4 : Ring Counter With SR Flip-Flops

The above block diagram shows the power-controlled ring counter. First, the total block is divided into two blocks. Each block has one SR flip-flop and a controller consisting of an SR flip-flop, an AND gate, and a clock distribution circuit.

## PROPOSED TECHNIQUE:



Fig5: Block Diagram For Proposed Delay Buffer

**GATED DRIVER TREE:**

**Fig6: Gated Driver Tree**

The gated driver tree is derived from the same clock gating signals of the blocks that it drives. Thus, in a quad-tree clock distribution network, the “gate” signal of the gate driver at the level (CKE) should be asserted when the active DET flip-flop

**MODIFIED RING COUNTER:**

**Fig7: Modified Ring Counter**

### DET (Double-edge-triggered) flip-flops:

Double-edge-triggered (DET) flip-flops are utilized to reduce the operating frequency by half. The logic construction of a DET flip-flop, which can receive an input signal at two levels of the clock, is analyzed together with a circuit design of a CMOS DET flip-flop. In this paper, we propose to use DET flip-flops instead of traditional DFFs in the ring counter to halve the operating clock frequency. DET flip-flops are becoming a popular technique for low-power designs since they effectively enable a halving of the clock frequency. The paper by Hossain et al. [1] showed that while a single-edge-triggered flip-flop can be implemented by two transparent latches in series, a double-edge-triggered flip-flop can be implemented by two transparent latches in parallel, and gave a circuit for the static flip-flop implementation. The clock signal is assumed to be inverted locally. For high-noise or low-voltage environments, Hossain et al. noted that the p-type pass-transistors may be replaced by n-types or that all pass-transistors may be replaced by transmission gates.
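The two-latches-in-parallel construction can be sketched behaviorally. The model below assumes one latch that is transparent while the clock is high, one that is transparent while it is low, and an output multiplexer that always selects the latch currently holding its value. The class name `DETFlipFlop` and the sampled `(clk, d)` interface are hypothetical, and no transistor-level detail is implied.

```python
class DETFlipFlop:
    """Behavioral double-edge-triggered flip-flop built from two parallel latches."""

    def __init__(self) -> None:
        self.latch_hi = 0  # transparent while clk == 1, holds while clk == 0
        self.latch_lo = 0  # transparent while clk == 0, holds while clk == 1

    def evaluate(self, clk: int, d: int) -> int:
        """Apply one (clk, d) sample and return the flip-flop output."""
        if clk:
            self.latch_hi = d  # high-phase latch follows the input
        else:
            self.latch_lo = d  # low-phase latch follows the input
        # The output mux picks the latch that is opaque in the current phase,
        # i.e. the value captured just before the most recent clock edge.
        return self.latch_lo if clk else self.latch_hi


ff = DETFlipFlop()
samples = [(0, 1), (1, 1), (1, 0), (0, 0), (0, 1), (1, 1)]
outputs = [ff.evaluate(clk, d) for clk, d in samples]
print(outputs)  # prints [0, 1, 1, 0, 0, 1]
```

In this trace the output takes a new value on the rising edge (second sample), on the falling edge (fourth sample), and on the next rising edge (sixth sample). The same data rate is therefore sustained with a clock of half the frequency a single-edge design would need.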

**C ELEMENT:** The Muller C-element, or Muller C-gate, is a commonly used asynchronous logic component originally designed by David E. Muller. It applies logical operations on the inputs and has hysteresis. The output of the C-element reflects the inputs when the states of all inputs match. The output then remains in this state until the inputs all transition to the other state. This model can be extended to the asymmetric C-element, where some inputs only affect the operation in one of the transitions (positive or negative). The figure shows the gate-level and transistor-level implementations and the symbol of the C-element.



**Fig8: C Element**

The C-element stores its previous state with two cross-coupled inverters, similar to an SRAM cell. One of the inverters is weaker than the rest of the circuit, so it can be overpowered by the pull-up and pull-down networks. If both inputs are 0, then the pull-up network changes the latch's state, and the C-element outputs a 0. If both inputs are 1, then the pull-down network changes the latch's state, making the C-element output a 1. Otherwise, the input of the latch is not connected to either $V_{dd}$ or ground, and so the weak inverter (drawn smaller in the diagram) dominates and the latch outputs its previous state.
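The hysteresis described above reduces to a simple rule for a two-input C-element: copy the inputs when they agree, otherwise hold. The sketch below is a purely behavioral illustration of that rule; the class name `CElement` is hypothetical and the weak-inverter mechanism is abstracted into a stored output bit.

```python
class CElement:
    """Behavioral two-input Muller C-element."""

    def __init__(self, initial: int = 0) -> None:
        self.out = initial  # stands in for the state kept by the cross-coupled inverters

    def update(self, a: int, b: int) -> int:
        """Apply inputs a and b and return the resulting output."""
        if a == b:
            self.out = a  # inputs agree: the output follows them
        # inputs differ: neither network drives the latch, so the state is held
        return self.out


c = CElement()
trace = [c.update(a, b) for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
print(trace)  # prints [0, 0, 1, 1, 0]
```

The output rises only after both inputs have risen and falls only after both have fallen, which is the property that makes the element useful for synchronizing events in asynchronous logic.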

## RESULT:



## CONCLUSION:

Clock gating is used in the FIFO to reduce power consumption. Common clock gating saves power but still leaves a large number of redundant clock pulses. For further power saving, data-driven clock gating and multibit flip-flops are used in sequential circuits. The multibit flip-flop method reduces the total number of inverters by sharing them among the flip-flops. Combining multibit flip-flops with data-driven clock gating further increases the power saving.

**REFERENCES :**

- [1] A. Raychowdhury and K. Roy, "Carbon-Nanotube-Based Voltage-Mode Multiple-Valued Logic Design", IEEE Transactions on Nanotechnology, Volume 4, Number 2, pp.168-178, 2005.
- [2] R.S.Shelar, "A Fast and Near-Optimal Clustering Algorithm for Low-Power Clock Tree Synthesis", IEEE Transactions On Computer-Aided Design Of Integrated Circuits And Systems, Volume 31, Number 11, pp.1781-1786, 2012.
- [3] S.Wang et al., "Power-Driven Flip-Flop Merging and Relocation", IEEE Transactions On Computer-Aided Design Of Integrated Circuits And Systems, Volume 31, Number 2, pp.180-191, 2012.
- [4] M.P.Lin et al., "Post-Placement Power Optimization with Multi-Bit Flip-Flops", IEEE Transactions On Computer-Aided Design Of Integrated Circuits And Systems, Volume 30, Number 12, pp.1870-1882, 2011.
- [5] I.H.Jiang et al., "INTEGRA: Fast Multibit Flip-Flop Clustering for Clock Power Saving", IEEE Transactions On Computer-Aided Design Of Integrated Circuits And Systems, Volume 31, Number 2, pp.192-204, 2012.
- [6] Y.T.Shyu et al., "Effective and Efficient Approach for Power Reduction by Using Multi-Bit Flip-Flops", IEEE Transactions On Very Large Scale Integration Systems, Volume 21, Number 4, pp.624-635, 2013.
- [7] U.K.Malviya and V.Tripathi, "Design and Implementation of Multi Level Logic for Digital System", International Journal of Science, Engineering and Technology Research, ISSN:2278-7798, Volume 2, Number 7, pp.1464-1468, 2013.
- [8] B. Choi, "Advancing from Two to Four Valued Logic Circuits", Proc. IEEE International Conference on Industrial Technology, Volume 13, pp.1057-1062, 2013.
- [9] B. Choi and K.Shukla, "Multi-Valued Logic Design and Implementation", International Journal of Electronics and Electrical Engineering, Volume 3, Number 4, pp.256-262, 2015.